Abstract: BotometerLite is advertised as a lightweight bot detector that improves scalability by using only user profile information; its authors further claim that using fewer features entails only a small compromise in individual accuracy. We test the validity of this claim by comparing the Botometer and BotometerLite bot likelihood scores for 10,000 randomly sampled users. BotometerLite scores differed drastically from Botometer scores.

Introduction

Botometer is one of the most popular bot detection tools used in social science (Rauchfleisch and Kaiser 2020). Botometer was initially launched in May 2014, and BotometerLite was released in September 2020. The training and performance evaluation of BotometerLite are described in “Scalable and Generalizable Social Bot Detection through Data Selection” (Yang et al. 2020).

Rauchfleisch and Kaiser (2020) found that Botometer scores are imprecise at estimating bots, especially in a different language, and are prone to variance over time, which leads to a high number of human users being classified as bots and vice versa.

In this study, we seek to answer the following questions:

  • How similar are Botometer and BotometerLite ratings?
  • Is BotometerLite effective at identifying specific types of bots (e.g., spammers, fake followers, etc.)?
  • Can BotometerLite be used as a triage tool to identify a subset of accounts that require more extensive evaluation via Botometer?

Bot Type Scores

Bot type scores describe how closely an account behaves like a specific kind of bot. The Botometer FAQ (https://botometer.osome.iu.edu/faq) defines the following types:

  • Astroturf: manually labeled political bots and accounts involved in follow trains that systematically delete content
  • Fake follower: bots purchased to increase follower counts
  • Financial: bots that post using cashtags
  • Self declared: bots from botwiki.org
  • Spammer: accounts labeled as spambots from several datasets
  • Other: miscellaneous other bots obtained from manual annotation, user feedback, etc.

Complete Automation Probability describes the probability, according to the Botometer model, that an account with this score or greater is a bot.

Methodology

  1. Randomly sample n accounts from the Twitter API (n = 10,000 in this study).
  2. Collect Botometer and BotometerLite bot likelihood scores for each sampled account.
  3. Calculate the correlation between the two sets of scores.
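
Step 3 can be sketched as follows. This is a minimal illustration, not the code used in this study: the account IDs and both score lists are synthetic stand-ins (assumptions) for the Twitter API sample and for the scores returned by the two tools, and `pearson_r` is a hypothetical helper written here with the standard library only.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Step 1 (stand-in): hypothetical account IDs instead of a real Twitter API sample.
rng = random.Random(0)
account_ids = rng.sample(range(1, 10**6), k=1000)

# Step 2 (stand-in): synthetic scores in [0, 1]. Real scores would come from
# Botometer and BotometerLite; these values are fabricated for illustration only.
botometer_scores = [rng.random() for _ in account_ids]
lite_scores = [0.5 * s + 0.5 * rng.random() for s in botometer_scores]

# Step 3: Pearson correlation and its square.
r = pearson_r(botometer_scores, lite_scores)
r_squared = r ** 2
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

Because Botometer reports several bot type scores per account, the same helper would be applied once per score type against the single BotometerLite score.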

Results

Raw English Scores

BotometerLite scores are most similar to the Botometer fake follower and spammer scores, with \(R^2\) values of 0.394 and 0.334, respectively. Hence, if Botometer scores are accurate, BotometerLite may be somewhat effective at identifying some fake followers and spammers.

The Pearson correlation matrix (the \(R^2\) values are the squares of the entries of this matrix) also shows that the scores are weakly correlated.
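
As a worked conversion between the two quantities, assuming each \(R^2\) value is the square of a single pairwise Pearson coefficient \(r\) as stated above, the reported \(R^2\) values correspond to the following magnitudes (the sign of \(r\) cannot be recovered from \(R^2\) alone):

\[
|r| = \sqrt{R^2}, \qquad \sqrt{0.394} \approx 0.628 \ \text{(fake follower)}, \qquad \sqrt{0.334} \approx 0.578 \ \text{(spammer)}
\]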

Raw Universal Scores

Conclusion

Future work for course project:

  • Update introduction to include other articles that have critiqued Botometer
  • Replicate the results of the Indiana University BotometerLite paper (train a classifier to predict manually labeled bots and compare it with BotometerLite)
  • Post code to a GitHub repo
  • Submit a post to Medium?

References

Rauchfleisch, Adrian, and Jonas Kaiser. 2020. “The False Positive Problem of Automatic Bot Detection in Social Science Research.” Berkman Klein Center Research Publication, no. 2020-3.

Yang, Kai-Cheng, Onur Varol, Pik-Mai Hui, and Filippo Menczer. 2020. “Scalable and Generalizable Social Bot Detection Through Data Selection.” In Proceedings of the AAAI Conference on Artificial Intelligence, 34 (01): 1096–1103.